Skip to content

Scal xyb#556

Merged
aliabdolali merged 31 commits into
NOAA-EMC:developfrom
erdc:scal_xyb
Dec 30, 2021
Merged

Scal xyb#556
aliabdolali merged 31 commits into
NOAA-EMC:developfrom
erdc:scal_xyb

Conversation

@aronroland
Copy link
Copy Markdown
Collaborator

@aronroland aronroland commented Dec 7, 2021

Pull Request Summary

A short overview of the PR
Remove XYB and replaced by double precision XGRD, YGRD and single precision ZB
https://github.com/NOAA-EMC/WW3/projects/2

Description

XYB is keeping the XY-coordinates and the depths, this array is deleted for the sake of memory saving and towards the integration of the scalability branch into develop. We rather use XGRD, YGRD and ZB instead. XGRD and YGRD are converted to double precision since this is needed by the unstructured part of WW3 for the accurate calculation of the geometric quantities of the unstructured triangular grid. The changes have been implemented in such a way that the structured part of the code and the non-decomposed part of the unstructured grid remains constant. The use of single precision in lat/lon coordinates may also be a bug for the structured, curvilinear, triangular or smc grid if applied to very high resolution.

Please also include the following information:

  • Add any suggestions for a reviewer
    enhancement
  • Are answer changes expected from this PR? Please describe the changes and the reason why in addition to which of the mod_def change only for the unstructured part

Issue(s) addressed

  • Please list any issues associated with this PR, including those the PR will fix/close. For example:

Commit Message

Remove XYB and replaced by double precision XGRD, YGRD and single precision ZB

Check list

Testing

  • How were these changes tested? Regression test on NOAA HPC
  • Are the changes covered by regression tests? Yes
  • Have the matrix regression tests been run? Yes, Intel on NOAA HPC

These two tests are not expected. Aron has them identical with Intel and Ali has them different on both Orion and Hera using Intel. @thesser1 could you test this test on ERDC HPC or your computer? One does not need mpi.

ww3_tr1/./work                     (1 files differ)
ww3_tr1/./work_std                     (1 files differ)

pre-known different cases:

**********************************************************************
********************* non-identical cases ****************************
**********************************************************************
mww3_test_03/./work_PR2_UQ_MPI_d2                     (6 files differ)
mww3_test_03/./work_PR2_UNO_MPI_d2                     (6 files differ)
mww3_test_03/./work_PR3_UQ_MPI_d2_c                     (8 files differ)
mww3_test_03/./work_PR1_MPI_d2                     (12 files differ)
mww3_test_03/./work_PR3_UNO_MPI_d2_c                     (6 files differ)
mww3_test_03/./work_PR3_UNO_MPI_d2                     (8 files differ)
mww3_test_03/./work_PR3_UQ_MPI_d2                     (8 files differ)
mww3_test_07/./work_PR3_UQ                     (4 files differ)
ww3_tp2.10/./work_MPI_OMPH                     (6 files differ)
ww3_tp2.16/./work_MPI_OMPH                     (4 files differ)
ww3_ufs1.3/./work_a                     (2 files differ)

Expected changes in mod_def files for unstructured cases:

ww3_tp2.17/./work_b                     (1 files differ)
ww3_tp2.17/./work_mb                     (1 files differ)
ww3_tp2.17/./work_mc                     (1 files differ)
ww3_tp2.17/./work_a                     (1 files differ)
ww3_tp2.17/./work_ma1                     (1 files differ)
ww3_tp2.17/./work_c                     (1 files differ)
ww3_tp2.17/./work_ma                     (1 files differ)
ww3_tp2.17/./work_mc1                     (1 files differ)
ww3_tp2.18/./work_TIDE_MPI                     (1 files differ)
ww3_tp2.6/./work_ST4                     (1 files differ)
ww3_tp2.6/./work_pdlib                     (1 files differ)
ww3_tp2.6/./work_ST0                     (1 files differ)
ww3_tp2.7/./work_ST0                     (1 files differ)
ww3_ts4/./work_ug_MPI                     (1 files differ)
  • Please indicate the expected changes in the regression test output (Note the known list of non-identical tests).
  • Please provide the summary output of matrix.comp (matrix.Diff.txt, matrixCompFull.txt and matrixCompSummary.txt):

matrixCompFull.txt
matrixCompSummary.txt
matrixDiff.txt

aronroland and others added 22 commits November 5, 2021 16:43
@JessicaMeixner-NOAA
Copy link
Copy Markdown
Collaborator

@aronroland Thanks for these nice updates - I can definitely see how this could impact any high-resolution case. I'm curious if you or @aliabdolali looked into the impact of this change (beyond not changing answers) for structured multi-grid test cases that are already memory intensive, for example, the fully resolution test case example in ww3_ufs1.1. Does this dramatically affect the memory usage for that test? This is important information as double versus single precision is not an option in the way this change is introduced.

Copy link
Copy Markdown
Collaborator

@JessicaMeixner-NOAA JessicaMeixner-NOAA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason that build_utils.sh and ft2src.sh were changed? This seems unrelated to this PR along with the additional comp/link files.

Comment thread model/bin/comp.ite Outdated
Comment thread model/src/w3gdatmd.F90 Outdated
Comment thread model/src/w3gridmd.F90 Outdated
Comment thread model/src/wmgridmd.F90 Outdated
Comment thread model/src/wmgridmd.F90 Outdated
Comment thread model/src/wmgridmd.F90 Outdated
Comment thread model/src/wmgridmd.F90 Outdated
Comment thread model/src/ww3_ounf.F90 Outdated
Comment thread model/src/ww3_ounf.F90 Outdated
DEALLOCATE(MX1, MXX, MYY, MXY, MAPOUT)
DEALLOCATE(MX1R, MXXR, MYYR, MXYR)
DEALLOCATE(AUX1)
!AR: This needs to be cleaned for the scalability part ..
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clean up comment?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cleaned!

Comment thread model/src/w3gridmd.F90 Outdated
@JessicaMeixner-NOAA
Copy link
Copy Markdown
Collaborator

@aliabdolali @aronroland there are still a few unresolved issues with this PR from the comments I made in the review last week. There are also the open question as to why the permissions are being changed on a few files. Lastly, there is the question on the impact of the memory since this is not added as an option. Please let me know if I need to look into that myself for impacts to the operations of the global systems here at NCEP.

@aliabdolali
Copy link
Copy Markdown
Contributor

@aliabdolali @aronroland there are still a few unresolved issues with this PR from the comments I made in the review last week. There are also the open question as to why the permissions are being changed on a few files. Lastly, there is the question on the impact of the memory since this is not added as an option. Please let me know if I need to look into that myself for impacts to the operations of the global systems here at NCEP.

@JessicaMeixner-NOAA the way @aronroland and I isolated and tested the scal_xyb does not affect the runtime and performance/outputs of the structured part of the code. As a sanity check, I checked the ufs1.* cases and there is no diff in the runtime. The reason we isolated the work is even one single change in the unstructured case made some changes in the structured part which sometimes takes hours to track and understand, sometimes impossible, therefore delaying the scalability reintegration and further development.
Aron can comment more here.

@JessicaMeixner-NOAA
Copy link
Copy Markdown
Collaborator

JessicaMeixner-NOAA commented Dec 15, 2021

@aliabdolali it's great to hear runtime is unaffected. How about the memory usage in the ufs1.1 with the high resolution grids? Again, if I need to do this work, I can I just need to know.

@aronroland
Copy link
Copy Markdown
Collaborator Author

aronroland commented Dec 15, 2021 via email

@JessicaMeixner-NOAA
Copy link
Copy Markdown
Collaborator

@aronroland there are a variety of ways to check the memory, including using the MEMCHECK switch you implemented. Some HPCs give memory output and we can also check the resulting size of files. Looks like I should just do these checks myself at this point. I hope to have information for you in a week if not before.

@aronroland
Copy link
Copy Markdown
Collaborator Author

aronroland commented Dec 15, 2021 via email

@JessicaMeixner-NOAA
Copy link
Copy Markdown
Collaborator

For the ww3_tr1 files that are different, it's the mod_def files. I wonder if something is written to the mod_def files that are changed by the changes made here or alternatively if this is a case of the memory issues that we seem to occasionally get. One question would be if we run this branch twice, does that test give different answers?

@aronroland
Copy link
Copy Markdown
Collaborator Author

aronroland commented Dec 17, 2021 via email

@ukmo-ccbunney
Copy link
Copy Markdown
Collaborator

Hi @JessicaMeixner-NOAA and @aronroland
Regarding he differences you are seeing in the mod_def.ww3 file for ww3_tr1, I have encountered this before and tracked it down to an uninitialized variable. See #425

Seeing this jogged my memory - I thought a fix had been merged for that Issue, but I don't believe it has.
Perhaps you could test whether it fixes your problem (the details of the fix are in the issue) and I will raise a PR for it.

Cheers.

@aronroland
Copy link
Copy Markdown
Collaborator Author

This is great that u have reported that. Many thanks for this!

@JessicaMeixner-NOAA
Copy link
Copy Markdown
Collaborator

The size of the output files did not change, additionally the total memory usage on all of the ww3_ufs* tests increase with this PR, but not so much so that I think this warrants needing to put in an option for this variable. If anyone is concerned about total memory usage for their applications please test this PR and make sure it's okay (@mickaelaccensi @ukmo-ccbunney & anyone else).

@aronroland
Copy link
Copy Markdown
Collaborator Author

aronroland commented Dec 18, 2021 via email

@JessicaMeixner-NOAA
Copy link
Copy Markdown
Collaborator

@aronroland I looked at the resulting file sizes and I used slurm commands to see the total memory usage of the regression test jobs for the ww3_ufs* tests. The increase in total memory used was about .1GB or less. I didn't save output.

@aliabdolali
Copy link
Copy Markdown
Contributor

For the ww3_tr1 files that are different, it's the mod_def files. I wonder if something is written to the mod_def files that are changed by the changes made here or alternatively if this is a case of the memory issues that we seem to occasionally get. One question would be if we run this branch twice, does that test give different answers?

I checked it this morning, running tr1 twice ended with a different mod_def :(

@aliabdolali
Copy link
Copy Markdown
Contributor

After ww3_tr1 fix and resolving comments, I reran the regtests and here is the summary:

pre-known different cases:

**********************************************************************
********************* non-identical cases ****************************
**********************************************************************
mww3_test_03/./work_PR2_UQ_MPI_d2                     (8 files differ)
mww3_test_03/./work_PR2_UNO_MPI_d2                     (6 files differ)
mww3_test_03/./work_PR3_UQ_MPI_d2_c                     (8 files differ)
mww3_test_03/./work_PR1_MPI_d2                     (10 files differ)
mww3_test_03/./work_PR3_UNO_MPI_d2_c                     (8 files differ)
mww3_test_03/./work_PR3_UNO_MPI_d2                     (8 files differ)
mww3_test_03/./work_PR3_UQ_MPI_d2                     (8 files differ)
mww3_test_07/./work_PR3_UQ                     (1 files differ)
ww3_tp2.10/./work_MPI_OMPH                     (7 files differ)
ww3_tp2.16/./work_MPI_OMPH                     (5 files differ)
ww3_ufs1.3/./work_a                     (1 files differ)

Expected changes in mod_def files for unstructured cases:

ww3_tp2.17/./work_b                     (1 files differ)
ww3_tp2.17/./work_mb                     (1 files differ)
ww3_tp2.17/./work_mc                     (1 files differ)
ww3_tp2.17/./work_a                     (1 files differ)
ww3_tp2.17/./work_ma1                     (1 files differ)
ww3_tp2.17/./work_c                     (1 files differ)
ww3_tp2.17/./work_ma                     (1 files differ)
ww3_tp2.17/./work_mc1                     (1 files differ)
ww3_tp2.21/./work_b                     (1 files differ)
ww3_tp2.21/./work_mb                     (1 files differ)
ww3_tp2.21/./work_a                     (1 files differ)
ww3_tp2.21/./work_ma                     (1 files differ)
ww3_tp2.6/./work_ST4                     (1 files differ)
ww3_tp2.6/./work_pdlib                     (1 files differ)
ww3_tp2.6/./work_ST0                     (1 files differ)
ww3_tp2.7/./work_ST0                     (1 files differ)
ww3_ts4/./work_ug_MPI                     (1 files differ)

@aliabdolali aliabdolali merged commit 7e006d9 into NOAA-EMC:develop Dec 30, 2021
@thesser1 thesser1 deleted the scal_xyb branch October 26, 2023 14:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

double precision is needed for high resolution applications

4 participants